PLL.PRO[P,JRA] - www.SailDart.org

perm filename PLL.PRO[P,JRA] blob sn#131165 filedate 1974-11-18 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00009 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	PROPOSAL FOR PROG. LANG. LAB.
C00003 00003	rather than compiler, would suggest  research into online
C00006 00004					ideas
C00008 00005			outline: programming language lab for data structure manipulation
C00015 00006	WHY LISP-LIKE: lisp, el1 or variant
C00017 00007	THE LANGUAGE
C00021 00008	INTRODUCTION
C00069 00009	WHY NOT OTHER LANGUAGE
C00070 ENDMK
C⊗;
PROPOSAL FOR PROG. LANG. LAB.

Verifying compiler
	research into 
	 on-line verification
	 extension of axiomatic semantics to new constructs
	 extension of axiomatic semantics to new programming units
	 standard techniques and debugging aids
	 symbolic interpretation
	 simplification
	 verification languages
rather than compiler, would suggest  research into online
system. since compiler suggsts a degree of "finality" which 
is perhaps premature.  That is the debugging-verification process
should really occur simultaneously with construction, and with 
compilation used only to finalize the correct result. 
The only reason for compiling is to increase
speed; ther is no conceptual advantage. Thus the  creative work  really
precedes compiaation-- perhaps "verifying interpreter" better states the
area which needs investigation.

consider partially verified prog... make change/addition... system should
be able to tell what conditions have changed... parts which were correct
may no longer be correct or may need further qualification.

it is not clear to me that ANY existing language is well suited to
such interactive development..perhaps EL1...LISP is too poor in 
data structures; and from only a passing glance, PASCAL, PL/1, ALGOL68,
type languages are too ad-hoc and batch oriented.

REDUCE has rudimentary on-line features but rather muddled semantics it
appears.  
Partial evaluation should be available like REDUCE.. 
form-valued variables  are a step in the right direction.
The general question of partially constructed programs and what can be said
about their correctness is difficult. What programming blocks(modules) are
"minimal" that is, what are the blocks which can be used with the assurance
that whatever assertions hold cannot be violated by any actions outside
the block...clearly  tied up with global variables....but trying to
write itereesting defintions without globals is difficult.

?symbolic interpretation is just evaluation of form-valued variables
in the absence of any knowledge.? YES!!!!
				ideas


should be difference between interactive languages and batch languages
should be difference between numerical languages and non-numerical languages

DIATRIBE ABOUT TIME TO DO PRACTICAL SYSTEM.

NEW PROG CONSTRUCTS: CONTENTION THAT PROG || DATA STRUCTURE: GENERIC, AND ON.

DIATRIBE AGAINST ASSIGNMENT:
  NAMES TEMPORARY QUANTITY, NEEDED BECAUSE ALGORITHM IS GETTING TOO COMPLICATED TO
REMEMBER DETAILS.
  THROW-BACK TO HARDWARE MACHINE
SHOULD BE ABLE TO DO WITHOUT IF D.S. ARE CAREFULLY DEFINED.

notice implicit difference in struct vs. seq.

diatribe about lisp: everything as tree; and algol seq.

show inadequacy of landin(p42)...TYPES go in structure definition, NAMES go into
abstract syntax which are struct.

eg "closure has a b_v part and a λ_body and an env"
should be closure ::= λ<var><exp> <env> ==> closure = struct[var:bvp;exp:bdy;env:env]
		outline: programming language lab for data structure manipulation
introduction
 time is ripe to develop such a lab
 growing sophistication on programming construction
 application to CAI: structured programming techniques
 relation to abstract d.s. and  VHLL
 clearer understanding of verification and construction
  this involves primarily  interpreters rather than compilers 
 involves new or modified language features: cf syntax directed compilers for
   Fortran vs. Algol, or LISP compiler.
 should be interpreted-based rather than compiler-based
 should be LISP-like to allow easy manipulation of programs, either by
the user or automatically


 what is needed and  what should be expected

  on-line verification

  on-line debugging
   that non-trivial programs cannot be expected to be automatically
    contructed with current formal systems
   that n-t programs cannot be expected to spring bug-free from the
    Creator
   debugging, like counterexample production in mathematics, is a useful
    means of arriving at a correct program.
   bugs come in at least two flavors (1) misunderstanding of basic 
     algorithm; (2) errors in  represenation.
        type(2) can be alleviated by careful data-structure repesentation
        type(1) by program constructs which are close to the intuitive understood
         algorithm.

  on-line construction
   besides obvious simple editing aids, monitoring of contruction can at least
     be done; e.g. type checking
   bookeeping of simple consequences of construction actions can be done;
     e.g.; modification or construction of program can have effect on
     extant code and assertions; much of this can be  monitored for consistency.

  on-line evaluation 
   use of evaluators is a well know technique for giving credence to
    program, indeed it is the desired end result.

  on-line simplification
   closely related to evaluation is symbolic interpretation, and to
    generate anything but a muddle, simplification of result is required


 how current approaches are lacking
  inadequacy of languages
    the three-bear problem: too much or not enough
      special requirements for data structued definitions and programming
       user defined structures
        significance for proofs and verification
       carefully chosen operations
        significance for proofs and verification
       efficient implementation
    no consistent study of forms,evaluation and simplification
      REDUCE
      Boyer-Moore
      Lombardi-Raphael
    form-valued variables as extension of sub-tree replacement systems

  indaequacy of environment
   most languages are batch
   most languages look the same: numerical and non-numerical
   need techniques and ultimately commands for interpreation, partial evaluation
     and general control of the construction, debugging and verification process
   need proof-checker and simplifier for form-valued evaluation


 proposed solution
  want a language (subset) which is useable, sufficiently concise that we
      can attack provability results with some assurance of success.
  structured LISP: a language which allows manipulation of abstract
      representation of programs is necessary.
   user defined data strucutres
    very large part of non-numerical LISP programming involves d.s.
     (1) simulation of data structures
     (2) (simulated) type checking
    separation of d.s. from program is a natural requirement; declaration
      of d.s. should percipitate axioms for constuctors, sel, and preds,
   programming constructs reflecting data-structuring
    the constructs should be tailored to type of ds.
     one for structures or records
     one for sequences
   form valued variables
    built-in as part of basic evaluator; this allows program manipulation
    the evaluation is  a treee transformer with each environment being a
    set of local rules
   a program checker and simplifier for symbolic interpretation
     a sub-tree replacement system
   future: a program prover
     a clean and consistent prover in the spirit of Boyer-Moore


car-cdr chains should NEVER occur in LISP program.
(1) simulating d.s.
(2) accessing subcomponents
(3) type checking
(4) primitive user functions
WHY LISP-LIKE: lisp, el1 or variant

evaluator-based
program as data: everyone manipulates abstract syntax rep

WHAT IS MISSING FROM LISP
user d.s.
	hoare all we need is seq and set and struct takes care of set.

simplification


BENEFITS
extensible
portable


PROBLEM AREAS NEEDING STUDY
ABSTRACT DS
ABSTRACT PROG
ABSTRACT CONTROL: DIRECTLY TIED TO A. D.S.
appears that proper characterization of a.d.s requires some specification of 
control. compare "set"; as a ads. we are free to declare it as atype and call
functions, postponing a decision about its rep until later. or are we?
when writing functions about sets, we must use some control structutes of the
language; but common languages have FOR, WHILE,and recursion, all of which
operate on existing (concrete) d.s.. Thus writing algorithm using set vars but
existing control  is a fraud, you have already really determined the concrete d.s.

Thus the problem requires that we be able to write algorithms using
abstract data structure AND abstract control, THEN be able to describe
a map of abstract d.s to concrete, and abstract control onto specific control.

PROGRAM IMPROVEMENT ABSTRACT ALG TO EFFICIENT AUTOMATICALLY OR OTHERWISE.

AREAS UNDERSTOOD, BUT NEED TO BE DETAILED
THE LANGUAGE

The  crucial and  most difficult  area  to examine  the  that of  the
language  in which we  wish to  specify our algorithms.  The question
must be  examined from  many sides.   1.  The language  must allow  a
realistically  large   selection  of   algorithms  to   be  specified
naturally.   2. The language must be sufficiently well strutured that
practical proof  techniques may be  developed so  that we can  reason
about  our  programs.     3.  The  language  must  have  an  efficient
implementation.  4. The data structures of the language must  be rich
enough that we can describe an efficient interpreter for the language
in the language. 

As  diverse as  these requirements  seem, it  is not  unreasonable to
expect a  solution.   What  is  perhaps unreasonable  to  expect, but
certainly lamentable,  is that an existing  programming language will
suffice.  Why, after  all the years  of experience  with languages is
language design is such a  sorry state?  The difficulty  stems from a
prolonged preoccupation with syntax and efficency of execution. These
areas --syntax  analysis  and  compilation--  are  inportant  to  the
development of  a viable  language, but  these areas  have been  well
understood for  a number of years and research  should push on to the
difficult area between syntax and execution-- that of semantics. 

Before we can adequately address the question of  program correctness
we  must be  able to  characterize the  meaning of  a program.  It is
unsatisfactory  to  simply  say  that the  meaning  of  a  program is
whatrever happens to its input...  A program is not best described as
a black  box impervious to our understanding,  only giving output for
some of  its  inputs.   An  integral part  of  our  responsibility as
researchers is to be  able to describe how the program  works as well
as  just what is does  as a function  of input and output.  If for no
other reason, we  must be  able to  influence the design  of our  own
machines. The  way to do that  is to recognize those  language design
constructs which  are valuable and then require that they become part
of the standard hardware repitorie.   (stacks...B5000-8000 design) We
are not  advocating the construction of a  machine for any particular
language, but surely there are  more building blocks for  programming
languages than just stacks. 

INTRODUCTION

The  overall goals  of  a  programming  language lab  should  be  the
development  of  a highly  useable  system to  aid  in  the creation,
debugging,  verifying  and  execution  of  programs  expressed  in  a
realistic programming language.  The computing community is painfully
aware of  the inadequacies of the current program-run-debug cycle. As
programs become  more sophisticated  and computers  become even  more
pervasive, the problem cannot help but get worse.  The problem, often
called   the   software   crisis,   is   primarily   conceptual   not
technological.   Faster  or bigger  machines will  not alleviate  the
difficulty.    We propose  a  unified  approach to  the  design  of a
programming environment to  guide all  phases of program  development
from conception  to a  completed program  which satisfies  its design
specifications.   We  are not proposing  to examine "knowledge¬based"
programming. We  do not  expect the system  to be  conversant in  the
subject matter of our programming domain. It seems quite premature to
expect significant  results in  such  undertakings. Rather  we  would
expect the system to be knowlegeable of  the programming language and
be able  to manipulate program modules under  user control. It should
be able to  answer intelligent questions not  necessarily be able  to
ask them. Systems which try to read minds are seldom successful. 

The issue of  useability is crucial to the success  of this project. 
We  wish  to  present  this  system  to  working  programmers  as  an
alternative to  current on-line edit-debug-run  systems. Such a  user
must  be  comfortable  with  the  interface  which will  be  used  to
manipulate  programs  of   the  programming   language.     Providing
unwieldly, clumsy, or stilted control  is unacceptable. The responses
provided  by the control program  must also provide  information in a
form understandable by the programmer.  Giving him  information about
his  program in  a  form  which is  not  immediately  related to  his
programming  constructs is unacceptable.  (compare clausal OUTPUT for
a theorem-prover which  accepts "natural INPUT".   or "dot  notation"
output with list notation input)

Similarly  the  programming language  which  we will  use  as a  test
vehicle must be sufficiently powerful so as to allow construction  of
"reasonably natural" programs.   Certainly this second  criterion can
be  satisfied  simply by  providing  one of  the  currently available
languages. However this seems unacceptable.  All  current programming
languages allow the  construction of programs which  far outstrip our
abilities  analyze  their  formal  properties.    We  can too  easily
construct  programs  which  are  so  ill-structured  or  opaque  that
analysis of their correctness is  not practical. It seems more to the
point then to develop a  powerful dialect of some existing  language,
representing a subset which is amenable to proof techniques. Then, as
our  understanding of verification  increases and  our techniques for
program construction improve, we can make extensions of  the original
language.  However  the original dialect should be chosen  so that we
can  speak  with  some assurance  about  the  properties  of programs
contained in it. 

Compare the  development of  programming language  syntax and  syntax
directed translation. The syntax of Fortran was developed in a rather
ad  hoc   manner,   before  an   understanding   of   syntax-directed
translation. Thus compilers for Fortran tend  to contain many special
tests   for  syntax  anomalies,   and  generally  tend   to  be  less
"structured"  than  compilers  for   Algol-like  languages.     Algol
translators  profitted from  an understanding  of formal  methods for
syntax  translation;  these  translators  are  more  "structured" and
easier to write. The expressive power of Algol did not suffer because
we imposed some requirements  on the design of the syntax. Similarly,
"semantic" (better, "pragmatic") restrictions  have been included  in
several languages;  a  typical case  being to  maintain a  stack-like
discipline in the activation of program modules in Euler.  

It  therefore does not seem  unreasonable to consider "certification"
restrictions when  designing  languages.    If we  expect  to  submit
programs to  proof techniques it  does not  seem at all  unnatural to
require  that the language  constructs match the power  of our formal
methods, any more than requiring that the syntax match our ability to
produce  efficient  parsings  or  that  programming  constructs  have
efficient run-time representations.   Already  such realizations  are
apparent on a small scale; compare the parameter passing restrictions
of Pascal. 

To  unduly raise user-expectation  by supplying a  language which far
outstrips our ability to semantically analyze its programs  will lead
to  immediate  and  valid  criticism.   Just  as  allowing  syntactic
contructs which require an undue amount of processing to discover the
parse are discouraged by syntax directed methods. 

This comparison of  syntax and semantics  in the problem of  language
design  is  not unfair,  we  believe.   The  active participation  of
semantic methods in the process of language design is much  overdue. 
If the current semantic  techniques are not powerful enough  to allow
natural description of  a useful language and to formulate reasonable
proofs, then we should know it.  We should no longer be  satisfied to
simply apply semantics to the analysis of existing languages. 

What are desireable features of  a programming language lab?  Ideally
we  would wish to  have a cradle-to-grave  programming environment in
which a user can (1) describe an algorithm, (2)  construct a program,
(3) experiment with  and (4) debug that program and  (5) modify it if
desired;  (6)  verify  that   the  program  realizes  the   described
algorithm, (7) ask for  improvements to the program, and  (8) compile
and  finally (9)  execute  the correct  program.   One  could further
desire that (10) an automatic program generator be available  to take
the  initial algorithmic  description  and present  the  user with  a
complete  program, guaranteed to  be correct and  efficient.  However
the thrust  of this  proposal is the  development of  well-structured
propgrams  through user interaction.   The "structuring"  of programs
derives from  the process  of manipulating  programs from  a  concise
algorithm to an efficient  program, rahter than from the  presence or
absence of  constructs in a particular language.   In essence we look
upon  a computer  as  a  manipulator  of  structures  rather  than  a
manipulator  of symbols  or numbers.   One  of the  useful structures
which we wish to manipualte is a program. 

Obviously  there is  much interation between  the system components.  
Indeed, redundancy is apparent as well. The point is not to produce a
"minimal" system,  but to  produce one which  is useable and  gives a
natural  environment  in  which  to  work.    Before  suggesting  the
abandonment of a well-established programming  technique we must show
that its use is either detrimental to good program design, or we must
produce an alternative  of superior merit.   Similarly, in  selecting
command facilities  to control the programming  environment we should
make  a concerted effort to include  standard techniques in a natural
setting.  Simply because the compiler could check  for syntax errors,
does  not mean  that  we should  postpone syntax  analysis  until the
program is to be executed. Simply because a program verifier can tell
us about some of the semantic errors doesn't  mean that we should not
expect  the  system  to tell  us  sooner  if possible.    We  are not
advocating a "clever  system" which  can "read minds",  but a  system
which  "knows" about  the programming  language (not  necessarily the
program)is  feasible  and  would be  a  distinct  improvement  of the
current state of affairs. 

The applications  of such a  language system  extend beyond simply  a
programmer's  aide.   Such  a programming  environment  should be  an
invaluable aid to  students learning the  language.  Construction  of
programs  in the  careful  manner  which  this lab  supports,  should
reinforce  the concepts of  structured programming.   the addition to
the basic lab of a  Monitor-Helper module could bring Computer  Aided
Instruction  in   programming  languages   far  beyond   the  current
"programmed textbook" approach. 

Now to add more detail to the preceding ten points. 

(1)  The specification language -- First,  the language presented for
the  description  of  the  algorithm  and  data  structures  must  be
approachable.  It  should  include  facilities  for  describing  data
structures.  The description facility  should be general and easy  to
use.    It  should  also  allow  the  user  to  specify  input/output
conventions for his data structures.  As an accompanying feature, the
language  should  contain  strong  typing.    Automatic   coercion  is
unacceptable, but  again the user  should be allowed to  specify type
conversion  conventions.   Thus execution of  any computations should
require type checking.   A significant  number of programming  errors
are the result of computations with mismatched types. 

Currently the  best manifestation of user-defined  data structures is
EL1, though the ideas are present in LISP, and have been rediscovered
in the interim by many people. Wirth, Hoare. 

A proper  characterization of an  abstract data structures  entails a
discussion  of   abstract  control  structures.    That  is,  control
structures present  in  a  programming language  are  concrete;  they
operate on the concrete data structures of the language. If we are to
separate  data  structures from  their  representation, then  we must
likewise  be  able  to  separate  control  from  acting  on  specific
representations. For example, if we wish to use set-valued variables,
then we most likely  will want the  set-membership relation; when  we
wish finally to map sets onto a data structure,  we will need a means
of characterizing membership in terms of existing control structures.  

Working will need to be done to give adequate axiomatic semantics for
both data structure and control structure definitions. 


(2) Second, in constructing the  program the user should at least  be
provided with  an understanding editor  in the spirit INTERLISP.   An
understanding   bookeeper  is   also  desireable   as  part   of  the
construction phase.   At  the most  trivial level  this observer  can
recognize  clashes  of  function  names,  incorrect  applications  of
defined  functions,  and   in  a  typed   language,  could  warn   of
discrepancies in  the types of  actual parameters,  formal parameters
and values. 

Consensus  says  that  the INTERLISP  system  is  the best  available
programming environment.  We can  gain much from that experiment  but
significant improvement can be made.  The major failings of INTERLISP
stem  from its  teletype-oriented ancestry.  No sophisticated on-line
system should  be predicated on  such simple input and  output.
One  need only  compare TVEDIT  or  E with  SOS or  TECO,  or contrast
debugging in RAID with debugging with DDT.  So even at this  very low
level of program  preparation much can be done.  Formatting programs,
like  LISP's PRETTYPRINT  can be  used to present  the program  to an
E-like editor. The transformations to and from the  internal abstract
syntax tree can all be  handled automatically.  The syntax tree would
never be seen by the programmer. 

***give details***

(3) To experiment with a program or partially constructed program, we
require an evaluator for expressions in the language.   The evaluator
should allow  partial evaluation, handling calls to  as yet undefined
programs segments, or as yet unbound variables. Certainly part of the
experimentation  process  will  involve  the  substantial  use  of  a
powerful   simplifier    in   manipulating   programs   by   symbolic
interpretation and in simplification of verification conditions. 

A conceptually simple way  to approach symbolic interpretation is  to
extend the range of program variables.  All languages allow variables
ranging  over data domains; several languages  cater to procedure (or
function-valued) variables.  The study of  symbolic interpretation is
readily  approached  by  allowing  expression-valued variables.    No
language contains  a  systematic  study  of  such  variables,  though
REDUCE's symbolic  mode is a  very simple incomplete  treatement, and
the work of Boyer-Moore is essentially an attempt at a theorem-prover
over such  a  symbolic domain.    Adding expression-  or  form-valued
variables   as  an   integral  part   of  the   language  design   is
benifcial(***why***).   Simplification of the  results of the partial
evaluations is obviously necessary.  The pragmatics of such languages
are an extension of sub-tree replacement systems.  Whether good sense
can be made of their semantics, is an open question. 

Several good symbolic manipulation systems are available, and one  on
them, REDUCE, has been used in a verification system. 


(4) Debugging  -- Non-trivial programs  cannot be expected  to spring
bug-free  from the mind  of the creator.   It is  difficult enough to
write  a  program   with  no  syntax  errors.     Debugging  is   not
counterproductive, but should be a useful tool in obtaining a correct
program.    Program debugging  can  involve either  execution  of the
program on examples or symbolic evaluation of  the program, comparing
the  behavior to  that  expected by  the described  algorithm,  or an
atttempt to verify the  correctness of the  program using the  formal
methods.  Program bugs arise from at  least two recognizable sources:
(1)misunderstanding  of   the  basic  algorithm  and  (2)  errors  in
representation.  Type  2 bugs can  be minimized by  use of a  careful
data-structuring representation which we will advocate.  Indeed, many
non-numerical problems  -- compilers, interpreters, etc.  -- are best
described as very  simple algorithms operating  on very complex  data
structures.  This view  has  several important  implications:  with a
simple algorithm the structure of the process is more transparent and
less suceptible  error (type-1  bugs); making  a concerted effort  to
separate to separate  what is static --data structures-- from what is
dynamic -- program-- has great conceptual advantage. 

Type 1  bugs can be  mollified by  presenting the  user with  control
structures which are close  to the intuitively understood algorithm. 
An analysis  of type-1 bugs is quite difficult, going far beyond what
we propose here. A system  proporting to handle such bugs  would have
to  know much about  the program and  the area  in which the  user is
working. 

Type-2 bugs are much more tractable and comprise the majority of  the
present-day headaches.   We can  reasonably assume  that the user  is
sympathetic and  is attempting to construct a  correct program; he is
not attempting to  be obfuscatory.  if we can  give him  sufficiently
high-level  programming  constructs,  then  we can  assume  that  his
program  is "almost  right".   His confusions  are due  to reasonably
low-level errors.  Until we can cope successfully  with these simpler
errors, it is doubtful that we can successfully tackle type-1 bugs. 

Obviously one of the difficulties in debugging is assigning the blame
when a program does not perform as expected. The manifestation of the
error may have been  caused by some distantly related  action. How to
present the  user with cogent information must  be investigated.  The
inadequacies of debugging a  high-level program at the machine  level
are well known. Attempting to debug large programs using verification
conditions is also fraught with peril. 

Discover of  either type bug will involve modification of the current
program. 


(5)  Program modification  involves  program  construction  but  also
should require a  bit more.  Modification of  a program will commonly
have an effect on the input/output conditions which currently  hold. 
Indeed the reason  for a modification excited  by a bug is  to change
the current  i/o conditions to conform to  those which are expected. 
However, program modifications may also affect other i/o assertions. 
We should  be able to tell  a user something about  the global effect
which local modification induces on the program. 

(6)verification -- As previously mentioned, simplification techniques
will play  a crucial  role here.   We  should also  make a  concerted
effort  to develop techniques  for higher level  verification, rather
than expect  proofs to  be  carried out  from  first principles.    A
beginning is  the recent work  of Suzuki; see  section(7).   Also the
introduction of  user-defined data structures has a beneficial effect
in allowing  verifications on the  properties of  the data  structure
rather  than on  the properties  of  the implementation  of the  data
structure. 


(7)improvement  --  One  necessary  component  of  this  comtemplated
programning  environment must  involve  changes  of  representation.  
Having established the correctness  of an program which uses abstract
data structures and control structures  we must be able to  transform
this program  into an  equivalent one  which operates over  different
data structures and/or different control structures. 


The  most well-known examples of  program improvement involve changes
in control structure, replacing  recursion with iteration.   However,
there  are   equally  troublesome   problems  in  changing   of  data
structures.    A  good  example  of  data  structure manipulation  is
contained in Floyd's TREESORT3.  In atempting to  decode the behavior
of  the program SIFTUP  in TREESORT3,  it soon  is apparent  that the
program complexity derives not from the complexity of the  algorithms
but form the intricacies of the encoding of a binary tree as an array
with an associated technique for finding successors in the tree. Once
the "real behavior" of SIFTUP is discovered, then giving a convincing
proof of  SIFTUP's correctness is  feasible.  That  correctness proof
might  involve a proof from  first principles as  done by Morales, or
might involve a more  intuitive verification using macros as  done by
Suzuki.    However,  what   is  really  desireable  is  to  push  the
verification process back even further tha Suzuki. We should be  able
to describe SIFTUP as a simple (recursive)  algorithm operating on an
abstract data  structure; we should expect to  be able to verify that
this abstract SIFTUP satisfies the necessary I/O relations;  and then
we should ask  the system to (1) recode SIFTUP  without recursion and
(2)  ask the system  to map the abstract  tree-representation onto an
array  repesentation  with  a  specified  successor  relation.    The
"natural"  abstract  SIFTUP  is  quite  straightforward, and  a  hand
simulation of  the  transformations  involved  in  obtaining  Floyd's
SIFTUP  and Hoare's  FIND  gives  encouragement for  this  approach.  
Indeed, the automatic, or semi-automatic improvement of programs is a
natural domain  for  program manipulation  systems.   The  programmer
wishes to state is problem in its most natural formulation; given the
relations  between  the  programmer's  representations  and  the data
structures and control structures of the programming lab, much of the
work  involved in  reconcilling the  two can  be  done mechanically.  
Indeed we  can  envision  a user  describing  an  abstract  algorithm
operating  on abstract  data structures  and  control structures;  he
should  be able to  verify the correctness  of the  abstractions to a
convincing  level;  and  then   could  use  the  system  to   perform
transformations on both the abstractions  nad the proofs to hopefully
obtain  a lower-level program complete with corresponding correctness
proof. 


(8)compilation --  The presence  of a  compiler is  perhaps the  most
peripheral of the components. Compilers add nothing to the conceptual
scheme of  the programming  lab; they  are obviously  useful when  we
finally wish  to execute a  program at maximum  speed.   Indeed rapid
execution can make the difference between an acceptable program and a
theoretical solution.   However  as far  as  construction of  correct
programs is concerned,  we should have done all the  hard work before
we get to the compiler.  The creativity resides in the other parts of
the system. 

Techniques of program correctness and verification  are applicable to
the generation of  more efficient code.  Clearly  if the compiler can
show that  our program  is  equivalent to  one which  generates  more
efficient code, then  that is all  to the good.   Indeed many  of the
current   optimization  techniques   are  simply   manifestations  of
equivalences which were recognized by the compiler writer. The use of
symbolic evaluation and simplification can  also be exploited here to
generate more efficient code. 


(9)execution   --   Execution  of   program  modules   is  reasonably
straightforward.   We must  note that  during execution  we will  see
compiled code, interpreted code, partially-specified code, and almost
assuredly we will see bugs. The control programs must cater for these
eventualities.  Communication  between interpreted and  compiler code
should  be  smooth;   if  the  output  behavior  of  a  unwritten  is
sufficiently well specified we should  be able to continue as if  the
program  were written.   Bugs  which  occur in  interpreted code  can
certainly be  presented to the user, while illegal attempts to access
either program or data from a compiled module must be foiled. 

(10)apg  --  Certainly  the  programming  problem  would  be  greatly
simplified if we could  specify our algorithm in a suitable formalism
and expect  a  Program  Generator  to  provide  us  with  a  suitable
implementation which  was guaranteed to  be correct.   Much work  has
been  done  in  this  field  and  some  non-trivial programs  can  be
generated  from  formal  specifications.    Even  if  we  allow  some
user-interaction to help guide  such programs, current methodology is
not  developed  to  a  point were  we  can  expect  large complicated
programs to  be generated  automatically.   Sound  work on  automatic
programming should continue but  it would be premature to expect such
projects to make this proposed lab obsolete in the near future. 


Rather  than   dilute  our   research   effort  by   undertaking   an
implementation  of a  full-fledged language  with  all its  necessary
support programs, we  are convinced that a satisfactory system can be
built on a currently implemented language. Thus all of  the necessary
components of the language laboratory can  be written in a high level
language.    Once  the  structure  of  the  control  program and  the
programming language  are settled, each  of the components--  editor,
evaluator, compiler, etc.-- can be programmed as a distinct unit. 

What can  we say about the  structure the programming  language?  For
descriptive   power,   easy   of   modification,   portability,   and
extensibility it is necessary that the language be able to operate on
an abstract data structure representation (parse tree) of programs. 

This  data  structure  representation  then  is  valid input  to  the
interpreter,  the  compiler,  the  editor,  the  debugger,   and  the
simplifier.  Our experience with the programming language LISP (which
is  a data  structure representation for  M-expressions) confirms the
efficacy of  this  approach.   The  most successful  enviroments  for
program development  have been LISP-based.  The  success derives from
the natural represntation of program-constructs as data items. 

Programming languages which allow a natural representation of  prgram
components as data items we will call LISP-based.  Obviously the word
"natural"  is open to interpretation.   Thus, although we  can map an
Algol program onto  a representation as an  Algol data object,  (most
likely a complicated  interrelated collection of arrays) it  is not a
natural  mapping.    The  recognition  of  the  power resulting  from
representing program as data in a high level language is as important
as the ability to perform that map.  LISP was the first such language
to exploit this idea, and thus deserves the credit. 


Thus a LISP-like language is essential.   Whether that language is  a
dialect of LISP, Pascal, or ALGOL68 is irrelevant.  What is important
is  that we  have  the ability  to manipulate  a  natural, structured
representation of programs  written in that language.   To the  usual
argument  that manipulation  and,in  particular,  evaluation of  such
representations  of programs  is inefficient  and slow, we  need only
remark  that  the slowness  and  gross  inefficency  of  the  current
debugging technology is much  more excessive.  A faster machine would
allow more rapid  interpretation of programs,  but machine speed  has
had little effect  on correct program  construction.  It  seems quite
peculiar to hear people complain about the interent inefficiencies of
interpretation, while spending months to debug their programs.   This
proposal is an  attempt to address the conceptual  inadequacies.  So,
disregarding syntactic considerations, what essential features should
the language provide the user? 

Certainly some means  of granting user-defined  data structures is  a
necessity. Hoare has remarked  that data structuring can be condensed
to manipulations  of  either  sets  or  sequences.    With  a  slight
modification, we believe  that this is  a reasonable first step.   An
initial  class  of  constructors of  data  structures  which we  call
context free is easily available. Basically, abstract representations
are context free if their concrete  syntax is specifiable in BNF.  We
will also grant simply semantic qualifications as attributes attached
to the  BNF.  Given  the BNF for  a representation  there is a  easy,
natural map  onto sequences and sets.   This is not a  trivial set of
data structures; most typical structures  fall into this class.   And
since we  believe that  most programming  projects in data  structure
manipulation  are   actually  very  simple  algorithms  operating  on
complicated data structures, the class of programs which we can write
is also  non-trivial.   Separating out the  data structures  from the
control component of a program has at least two benefits. 

First,  from a purely  practical view, the resulting  program is much
more readible.   It is also  much more difficult  to write a  program
which takes unfair advantage of the data representation. 

Second,   the  implicit   relationships  between   the  constructors,
selectors, and recoginzers attached  to the data structure,  manifest
themselves as axioms of our formal system. This will have benefits in
the  verification phases of the  program generation.  That  is we can
verify program  properties using our  higher level  knowledge of  the
data structure rather than being forced to estavlish our proofs using
properties of the particular representation. As long as we are unable
to refer to underlying  representation our high level  reasoning will
be secure.  The benefits  acruing to such abstractions are well known
at the programming level.  The  separation of the algorithm from  the
particular data representation is one of the principles of structured
programming, extending back  to the earliest papers of J.  McCarthy. 
Abstract  data  structuring  is  simply  McCarthy's  abstract  syntax
appearing in  s slightly different  guise.  Abstraction,  we believe,
should  give a comparable  gain in verification efforts.   Ideally an
abstraction should form an intuitively acceptible  "basis" [luckham &
von  Henke] from  which the  system can  be guided  to a  correct and
efficient concrete program. 

We can now give a reasonably accurate description of the mechanics of
the proposed  system.   The input  to the  system, both programs  and
commands will  be parsed to an abstract  syntax representation by the
PARSER.  When output is  requested by the user, either explicitly  or
implicitly,  the UNPARSER  will perform  the inverse  transformation,
presenting readible output. All other components of the system -- the
command interpreter, the editor, the form  evaluator, the simplifier,
the debugger, the  verifier, and the compiler --  will all operate on
this abstract syntax tree. 

As outlined, the lab is a complex, long-termed project.  Many of  the
components  have been  studied  separately;  it  seems high  time  to
attempt  to build  a cohesive system,  for only  in that way  will we
really find out  how much we  know and how  much work  is left to  be
done.  Which of the desiderata are realistically attainable?  What is
the current state of development of the components of such a system?  

First,  once the basic structure  of the system  is designed, each of
the components can be attacked as a  reasonably self-contained unit. 
Also because of the  LISP-like orientation of the language, debugging
and modification of the system should be more tractable. Indeed,  one
of the  first  experiments should  be a  verification  of the  system
itself. 

Since significant research  has been done in most, if not all, of the
ten areas, progress to  an initial system should  be rapid.  What  is
clearly  missing  in  current  efforts  is  a  concerted  attempt  to
implement  a verification  based language  lab to  see if  indeed the
formal techniques can truly become a programmer's assistant. 

language choices: lisp el1 pascal

lisp needs superstructure of d.s. editor debugger verifier

el1 needs verifier rest unk

pascal needs runtime? interpreter;d.s. 

which parts are expendible compiler apg

add more  about lisp-like:  closures therefore no  extra probs  about
fancy control: it's there

WHY NOT OTHER LANGUAGE

compiler-based language is inadequate
problems 
	debugging in machine lang
	what does editor operate on?
	verifier must operate on other repesentation of prog
	simplifier must operate on something else

add sdio